DC_output and wind_velocity.
DC_output on vertical scale.
Call:
lm(formula = DC_output ~ wind_velocity, data = windmill)
Residuals:
     Min       1Q   Median       3Q      Max
-0.59869 -0.14099  0.06059  0.17262  0.32184
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.13088    0.12599   1.039     0.31
wind_velocity  0.24115    0.01905  12.659 7.55e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2361 on 23 degrees of freedom
Multiple R-squared: 0.8745, Adjusted R-squared: 0.869
F-statistic: 160.3 on 1 and 23 DF, p-value: 7.546e-12
broom has these: glance, showing that the R-squared is 87%, and
tidy, showing the intercept and slope and their significance.
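A minimal sketch of the glance/tidy step. The real windmill measurements are not reproduced here, so the data frame `windmill_sim` is a simulated stand-in with made-up values, and the model name `DC.1` is an assumption; the numbers it prints will not match the output above.

```r
library(broom)

# Simulated stand-in for the windmill data (made-up values, NOT the real measurements)
set.seed(1)
windmill_sim <- data.frame(wind_velocity = seq(2.5, 10, length.out = 25))
windmill_sim$DC_output <- 3 - 7 / windmill_sim$wind_velocity + rnorm(25, sd = 0.1)

# Fit the straight line and store the result (DC.1 is an assumed name)
DC.1 <- lm(DC_output ~ wind_velocity, data = windmill_sim)

glance(DC.1)  # one-row model summary: r.squared, adj.r.squared, sigma, ...
tidy(DC.1)    # one row per coefficient: estimate, std.error, statistic, p.value
```

Both functions return data frames, so the R-squared or a slope can be pulled out by column name rather than read off printed output.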
lm actually fits the regression. Store results in a variable. Then look at the results, e.g. via summary or glance/tidy.
wind_velocity strongly significant, R-squared (87%) high.
Add a squared term in lm by adding \(x^2\) to right side of model formula with +:
I() necessary because ^ in model formula otherwise means something different (to do with interactions in ANOVA).
Plot layers:
points (geom_point);
line (geom_smooth, which draws best-fitting line);
predicted DC_output values, joined by lines (with points not shown).
The geom_line uses the predictions (from DC.2) as the y-points to join by lines, instead of the original data points. Without the data and aes in the geom_line, original data points would be joined by lines.
Curve clearly fits better than line.
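The parabola fit and the three-layer plot could be sketched as follows. This is an illustration under assumptions: `windmill_sim` is simulated stand-in data (made-up values), and the helper data frame `fits` is a hypothetical name; only `DC.2` comes from the notes.

```r
library(ggplot2)

# Simulated stand-in for the windmill data (made-up values, NOT the real measurements)
set.seed(1)
windmill_sim <- data.frame(wind_velocity = seq(2.5, 10, length.out = 25))
windmill_sim$DC_output <- 3 - 7 / windmill_sim$wind_velocity + rnorm(25, sd = 0.1)

# Parabola: I() protects ^ so that it means "square" rather than formula crossing
DC.2 <- lm(DC_output ~ wind_velocity + I(wind_velocity^2), data = windmill_sim)

# Without I(), the squared term collapses back to wind_velocity itself
no_I <- lm(DC_output ~ wind_velocity + wind_velocity^2, data = windmill_sim)
length(coef(DC.2))  # 3 coefficients
length(coef(no_I))  # 2 coefficients

# Predictions to join by lines (fits is a hypothetical helper name)
fits <- data.frame(wind_velocity = windmill_sim$wind_velocity,
                   fitted = fitted(DC.2))

g <- ggplot(windmill_sim, aes(x = wind_velocity, y = DC_output)) +
  geom_point() +                              # the data
  geom_smooth(method = "lm", se = FALSE) +    # best-fitting straight line
  geom_line(data = fits, aes(y = fitted))     # parabola predictions, joined by lines
g
```

The `data` and `aes` inside `geom_line` are what switch the y-values from the observations to the predictions; the x aesthetic is inherited from the main `ggplot` call.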
There is a problem with parabolas, which we’ll see later.
Ask engineer, “what should happen as wind velocity increases?”:
Mathematically, asymptote. Straight lines and parabolas don’t have them, but e.g. \(y = 1/x\) does: as \(x\) gets bigger, \(y\) approaches zero without reaching it.
What happens to \(y = a + b(1/x)\) as \(x\) gets large?
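Working the question through as a limit (a short derivation, treating \(a\) and \(b\) as fixed constants):

\[
\lim_{x \to \infty} \left( a + b \cdot \frac{1}{x} \right)
  = a + b \lim_{x \to \infty} \frac{1}{x}
  = a + b \cdot 0
  = a
\]

So \(y\) levels off at \(a\) without reaching it: the intercept of \(y = a + b(1/x)\) is the asymptote. If \(b < 0\), \(y\) approaches \(a\) from below.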
Fit this, call it asymptote model.
Fitting the model here because we have math to justify it.
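One way the asymptote model might be fitted, as a sketch only. The data frame `windmill_sim` is simulated (made-up values generated from an assumed asymptote of 3), and the model name `DC.3` is an assumption.

```r
# Simulated stand-in for the windmill data (made-up values, NOT the real measurements)
set.seed(1)
windmill_sim <- data.frame(wind_velocity = seq(2.5, 10, length.out = 25))
windmill_sim$DC_output <- 3 - 7 / windmill_sim$wind_velocity + rnorm(25, sd = 0.1)

# The explanatory variable is the reciprocal of wind velocity
windmill_sim$wind_pace <- 1 / windmill_sim$wind_velocity

# y = a + b * (1/x) is an ordinary straight-line regression on wind_pace
DC.3 <- lm(DC_output ~ wind_pace, data = windmill_sim)

coef(DC.3)               # intercept estimates the asymptote a, slope estimates b
summary(DC.3)$r.squared  # proportion of variation explained
```

Because the model is linear in `wind_pace`, `lm` can fit it directly; no nonlinear fitting is needed.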
Reciprocal of wind_velocity we call wind_pace.
Pretty straight. Blue actually smooth curve not line:
Asymptote model has 1 explanatory variable (wind_pace) vs. 2 for parabola model (wind_velocity and its square).
wind_pace (unsurprisingly) strongly significant.
w2
ggplot likes to have one column of \(x\)’s to plot, and one column of \(y\)’s, with another column for distinguishing things.
Use pivot_longer, then plot:
DC_output).
wind_velocity higher.
 [1]  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0  6.5  7.0
[14]  7.5  8.0  8.5  9.0  9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5
[27] 14.0 14.5 15.0 15.5 16.0
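The pivot_longer step mentioned above (one column of \(x\)’s, one of \(y\)’s, one to distinguish the models) might look like this. It is a sketch under assumptions: `windmill_sim` is simulated stand-in data, the contents of `w2` are guessed from context, and `w2_long`, `DC.3`, `model` and `fit` are hypothetical names.

```r
library(tidyr)
library(ggplot2)

# Simulated stand-in for the windmill data (made-up values, NOT the real measurements)
set.seed(1)
windmill_sim <- data.frame(wind_velocity = seq(2.5, 10, length.out = 25))
windmill_sim$DC_output <- 3 - 7 / windmill_sim$wind_velocity + rnorm(25, sd = 0.1)
windmill_sim$wind_pace <- 1 / windmill_sim$wind_velocity

DC.2 <- lm(DC_output ~ wind_velocity + I(wind_velocity^2), data = windmill_sim)
DC.3 <- lm(DC_output ~ wind_pace, data = windmill_sim)  # assumed name

# Wide layout: one column of fitted values per model
w2 <- data.frame(
  wind_velocity = windmill_sim$wind_velocity,
  DC_output = windmill_sim$DC_output,
  parabola = fitted(DC.2),
  asymptote = fitted(DC.3)
)

# Long layout: fitted values stacked in one column, with a column naming the model
w2_long <- pivot_longer(w2, cols = c(parabola, asymptote),
                        names_to = "model", values_to = "fit")

g <- ggplot(w2_long, aes(x = wind_velocity, y = fit, colour = model)) +
  geom_line() +
  geom_point(aes(y = DC_output), colour = "black")
g
```

After pivoting, each observation appears once per model, so `colour = model` is enough to draw one curve per fit.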
Use predict, which requires what to predict for, as data frame. The data frame has to contain values, with matching names, for all explanatory variables in regression(s):
wind_velocity.
wind_pace (reciprocal of velocity).
Data frame wv_new with those in:
wv_new
my_fits
DC_output between 0 and 3 from asymptote model. Add rectangle to graph around where the data were:
For high wind_velocity, asymptote model behaves reasonably, parabola model does not.
What happens as wind_velocity goes to zero? Should find DC_output goes to zero as well. Does it?
As wind_velocity heads to 0, wind_pace heads to \(+\infty\), so DC_output heads to \(-\infty\)!
wind_velocity to understand relationship. (Is there a lower asymptote?)
DC_output to be zero for small wind_velocity.
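A sketch of the prediction step, assuming simulated stand-in data (`windmill_sim`, made-up values) and an assumed model name `DC.3`; the layout of `my_fits` is a guess at what the notes printed.

```r
# Simulated stand-in for the windmill data (made-up values, NOT the real measurements)
set.seed(1)
windmill_sim <- data.frame(wind_velocity = seq(2.5, 10, length.out = 25))
windmill_sim$DC_output <- 3 - 7 / windmill_sim$wind_velocity + rnorm(25, sd = 0.1)
windmill_sim$wind_pace <- 1 / windmill_sim$wind_velocity

DC.2 <- lm(DC_output ~ wind_velocity + I(wind_velocity^2), data = windmill_sim)
DC.3 <- lm(DC_output ~ wind_pace, data = windmill_sim)  # assumed name

# New data must carry every explanatory variable, under the same names
wv_new <- data.frame(wind_velocity = seq(1, 16, 0.5))
wv_new$wind_pace <- 1 / wv_new$wind_velocity

my_fits <- data.frame(
  wind_velocity = wv_new$wind_velocity,
  parabola = predict(DC.2, wv_new),
  asymptote = predict(DC.3, wv_new)
)
my_fits

# Asymptote model as velocity shrinks towards 0: pace grows, prediction dives
near_zero <- predict(DC.3, data.frame(wind_pace = 1 / c(1, 0.1, 0.01)))
near_zero
```

The last call illustrates the point about small velocities: with a negative slope on `wind_pace`, the predicted output keeps falling as velocity approaches zero instead of settling at zero.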
Comments
geom_smooth smooths scatterplot trend. (Trend called “loess”, “locally weighted least squares”, which downweights outliers. Not constrained to be straight.)
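A minimal sketch of a scatterplot with a smooth trend, using simulated stand-in data (`windmill_sim`, made-up values) because the real measurements are not reproduced here.

```r
library(ggplot2)

# Simulated stand-in for the windmill data (made-up values, NOT the real measurements)
set.seed(1)
windmill_sim <- data.frame(wind_velocity = seq(2.5, 10, length.out = 25))
windmill_sim$DC_output <- 3 - 7 / windmill_sim$wind_velocity + rnorm(25, sd = 0.1)

g <- ggplot(windmill_sim, aes(x = wind_velocity, y = DC_output)) +
  geom_point() +
  geom_smooth()  # with no method given, small data sets get a loess smooth
g
```

Adding `method = "lm"` to `geom_smooth` would draw the straight regression line instead of the smooth trend.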